Spatial variation and risk factors of malaria and anaemia among children aged 0 to 59 months: a cross-sectional study of 2010 and 2015 datasets

Malaria and anaemia are common diseases that affect children, particularly in Africa. Studies on the risk associated with these diseases and their synergy are scanty. This work aims to study the spatial pattern of malaria and anaemia in Nigeria and to adjust for their risk factors using separate models for malaria and anaemia. This study used Bayesian spatial models within the Integrated Nested Laplace Approximation (INLA) framework to establish the relationship between malaria and anaemia. We also adjusted for risk factors of malaria and anaemia and mapped the estimated relative risks of these diseases to identify regions with a relatively high risk. We used data obtained from the Nigeria Malaria Indicator Survey (NMIS) of 2010 and 2015. The spatial variability of both diseases was investigated using the convolution model, the Conditional Auto-Regressive (CAR) model, the generalized linear mixed model (GLMM) and the generalized linear model (GLM) for each year. The convolution model and the GLMM had the lowest Deviance Information Criterion (DIC) in 2010 for malaria and anaemia, respectively. The CAR and convolution models had the lowest DIC in 2015 for malaria and anaemia, respectively. Children in rural areas had significantly higher odds of malaria and anaemia, expressed as adjusted odds ratios (AOR) [2010: malaria AOR = 1.348, 95% CI = (1.117, 1.627); anaemia AOR = 1.455, 95% CI = (1.201, 1.7623). 2015: malaria AOR = 1.889, 95% CI = (1.568, 2.277); anaemia AOR = 1.440, 95% CI = (1.205, 1.719)]. Controlling the prevalence of malaria and anaemia in Nigeria requires identifying where a child lives and addressing socio-economic factors, which may reduce childhood malaria and anaemia.


Data and method
Studied data. This study used the 2010 and 2015 data collected in the Malaria Indicator Survey (MIS) carried out in Nigeria. In both years, the sampling frame was obtained from the 2006 Population and Housing Census of the Federal Republic of Nigeria, conducted by the National Population Commission 5,28. Administratively, Nigeria is divided into 37 states, including the Federal Capital Territory; each state is divided into local government areas (LGAs), and each LGA is further divided into localities. For convenience, each locality was subdivided into census enumeration areas (EAs). These EAs from the 2006 census frame were used to define the clusters, i.e. the primary sampling units (PSUs) 5.
A two-stage probability sampling design was used in both 2010 and 2015. In 2010, the two-stage cluster design comprised 240 clusters, 83 in urban areas and 157 in rural areas. In the end, 239 clusters were used because of intercommunal unrest in one of the clusters. In the first stage, a representative sample of approximately 6000 households was selected for the survey, with a minimum target of 920 completed individual women's interviews per zone. In the second stage, an average of 26 households were selected in each cluster by equal probability systematic sampling. All women aged 15-49 were interviewed, and children aged 6 to 59 months were tested for malaria and anaemia 28. In 2015, on the other hand, 9 clusters (EAs) were selected from each state and from the Federal Capital Territory (FCT). Each state was represented in the sample, with a total of 333 clusters around the country, 138 in urban areas and 198 in rural areas. From each cluster, 25 households were selected in the second stage by equal probability systematic sampling; all women aged 15-49 were interviewed, and all children aged 6-59 months were tested for malaria and anaemia 5.
Furthermore, a comprehensive inventory of households was carried out in both years. In 2010, the mapping exercise was done from August to September, while in 2015 it was carried out from June to July 5,29.

www.nature.com/scientificreports/

Dependent variables. The dependent variables used in this study are malaria status (presence or absence) and anaemia status (presence or absence). Malaria is a disease spread to humans through the bite of an infected Anopheles mosquito. The malaria rapid diagnostic test (RDT), a form of immunochromatographic test, detects malaria antigens released from parasitized red blood cells. Both rapid diagnostic testing and microscopy have been approved by the World Health Organization (WHO) as procedures for malaria diagnosis. Although microscopy is recognized as the standard approach for malaria diagnosis, it is demanding to apply: microscopy requires an experienced microscopist, a suitable environment and time, whereas RDTs do not need skilled personnel, specialized equipment or a long process 2. Anaemia is a condition resulting from a decrease in, or dysfunction of, red blood cells in the body. Iron deficiency is a common cause of anaemia, but in developing countries malaria, as an infectious disease, also contributes to anaemia 5. As recommended by WHO, children aged 6 to 59 months are considered anaemic if their haemoglobin (Hb) concentration is below 11.0 g/dl, those aged 5 to 11 years if Hb is below 11.5 g/dl, and those aged 12 to 14 years if Hb is below 12.0 g/dl 30. The cause of anaemia depends on the part of the world in which a child lives.
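The WHO haemoglobin cut-offs quoted above can be expressed as a small lookup. The following is a minimal sketch rather than the procedure used in the survey: the function names are hypothetical, and the conversion of the age bands to months (5 to 11 years read as 60 to 143 months, 12 to 14 years as 144 to 179 months) is an assumption made for illustration.

```python
def anaemia_threshold_g_dl(age_months):
    """Return the Hb cut-off (g/dl) for the age bands quoted in the text.

    Assumption: year bands are converted to completed months, so
    "5 to 11 years" is 60-143 months and "12 to 14 years" is 144-179 months.
    """
    if 6 <= age_months <= 59:
        return 11.0
    if 60 <= age_months < 144:
        return 11.5
    if 144 <= age_months < 180:
        return 12.0
    raise ValueError("age is outside the bands quoted in the text")


def is_anaemic(age_months, hb_g_dl):
    """Binary anaemia indicator: 1 if Hb is below the cut-off, 0 otherwise."""
    return int(hb_g_dl < anaemia_threshold_g_dl(age_months))


# Hypothetical example: a 24-month-old child with Hb = 10.4 g/dl.
example_status = is_anaemic(24, 10.4)
```

Only the first band (6 to 59 months) is relevant to the children analysed in this study; the other two are included because the text quotes them.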
Timely diagnosis and immediate treatment have been advised by WHO as major strategies for managing malaria and anaemia and for reducing the high mortality in the regions where the diseases are most prevalent. Considering the strong correlation between malaria infection and anaemia, both microscopy and rapid diagnostic tests were approved for the diagnosis of the two diseases in field surveys 31. In this work, the outcomes of interest were the result of the malaria rapid test and the anaemia status, used as binary indicators of the presence of malaria and anaemia in a child's blood sample, respectively, where 1 denotes the presence of malaria or anaemia and 0 otherwise. The yearly distribution of malaria and anaemia during the two weeks before the interview is as follows. In 2010, 5056 children were tested for anaemia and 5147 for malaria: for anaemia, 3512 tested negative and 1544 tested positive, while for malaria, 2719 tested negative and 2428 tested positive. In 2015, 6021 children were tested for anaemia and 6025 for malaria: for anaemia, 4062 tested negative and 1959 tested positive, while for malaria, 3399 tested negative and 2626 tested positive.

Independent variables. In this study, the independent variables comprised demographic, socio-economic and geographical variables selected on the basis of previous studies 2,11,32. They are type of place of residence, source of drinking water, type of toilet facility, presence of electricity, ownership of a radio, ownership of a television, main floor material, main wall material, main roof material, wealth index, child's age in months, child's sex and mother's highest educational level; state and region were included as geographical variables. The selected variables in Table 1 were based on the DHS and MIS data sets as well as relevant literature.
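As a quick consistency check on the test counts reported for the dependent variables, the crude (unweighted) proportion testing positive can be computed directly from the negative and positive counts. This is only an illustrative sketch: the dictionary layout and function name are hypothetical, and the proportions ignore the survey weights.

```python
# Negative/positive test counts as reported in the text.
counts = {
    (2010, "anaemia"): {"negative": 3512, "positive": 1544},
    (2010, "malaria"): {"negative": 2719, "positive": 2428},
    (2015, "anaemia"): {"negative": 4062, "positive": 1959},
    (2015, "malaria"): {"negative": 3399, "positive": 2626},
}


def crude_prevalence(negative, positive):
    """Unweighted share of tested children with a positive result."""
    return positive / (negative + positive)


# Number tested and crude prevalence for each year and disease.
tested = {key: c["negative"] + c["positive"] for key, c in counts.items()}
prevalence = {key: crude_prevalence(**c) for key, c in counts.items()}

for (year, disease), p in sorted(prevalence.items()):
    print(f"{year} {disease}: n = {tested[(year, disease)]}, crude prevalence = {p:.1%}")
```

The totals reproduce the numbers tested (5056 and 5147 in 2010, 6021 and 6025 in 2015), and the crude proportions are roughly 30.5% (anaemia) and 47.2% (malaria) in 2010 and 32.5% and 43.6% in 2015.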

Methods.
Here, we describe the Bayesian spatial models used to estimate the spread and the risk factors of malaria and anaemia in the 37 states of Nigeria, including the Federal Capital Territory.
Spatial model. A descriptive statistics approach was used to summarize the independent variables of the study sample in the form of descriptive tables and simple percentages, while each of the dependent variables was described in the form of maps. For malaria and anaemia among children under 5 years of age, the relationship with each independent variable was tested using the chi-square test of association. Thereafter, binary logistic regression was used to study the relationship between the independent variables and malaria or anaemia. Stepwise backward selection was used to pick the factors that have a significant relationship with malaria or anaemia among children under 5 years. Adjustment for clustering, common causes and sampling weights was carried out using the weighting factors already constructed by MEASURE DHS.

For the spatial study of the malaria and anaemia data, let $y_{ik}$ be the binary malaria or anaemia status of child $i$, $i = 1, 2, \ldots, n_k$, where $n_k$ is the number of children in state $k$, $k = 1, 2, \ldots, 37$. The binary response is defined as

$$y_{ik} = \begin{cases} 1, & \text{if child } i \text{ in state } k \text{ has malaria or anaemia}, \\ 0, & \text{otherwise}. \end{cases}$$

The $y_{ik}$ are assumed to follow a Bernoulli distribution with likelihood function

$$L(y \mid \theta) = \prod_{k=1}^{37} \prod_{i=1}^{n_k} \theta_{ik}^{y_{ik}} (1 - \theta_{ik})^{1 - y_{ik}},$$

where $\theta_{ik} = P(y_{ik} = 1)$ are unknown probabilities and $E(y_{ik}) = \theta_{ik}$ is related to the predictor through the logit link function

$$\operatorname{logit}(\theta_{ik}) = \log\left(\frac{\theta_{ik}}{1 - \theta_{ik}}\right) = \eta_{ik} = x_{ik}'\beta. \qquad (1)$$

Here the vector $x_{ik} = (1, x_{ik1}, \ldots, x_{ikp})'$ contains the categorical and continuous variables and $\beta = (\beta_0, \beta_1, \ldots, \beta_p)'$ is the vector of regression coefficients. This model accommodates the covariates only in parametric form. To allow a more flexible approach, the linear predictor $\eta_{ik}$ of Eq. (1) is extended, which increases the model complexity by admitting different forms of variables. In particular, the logistic regression model is extended to allow area-specific random effects by replacing the linear predictor $\eta_{ik}$ in Eq. (1) with a geoadditive predictor.
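To make the Bernoulli likelihood and the logit link of Eq. (1) concrete, the sketch below evaluates the log-likelihood for a given coefficient vector. It is an illustration under stated assumptions, not the estimation procedure of this study (which relies on INLA): the function names and the toy data are hypothetical, and no random effects are included.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def linear_predictor(x, beta):
    """eta = x' beta for one child; x starts with 1 for the intercept."""
    return sum(x_j * b_j for x_j, b_j in zip(x, beta))


def bernoulli_log_likelihood(y, X, beta):
    """Sum of y*log(theta) + (1 - y)*log(1 - theta) over all children."""
    total = 0.0
    for y_i, x_i in zip(y, X):
        theta = inv_logit(linear_predictor(x_i, beta))
        total += y_i * math.log(theta) + (1 - y_i) * math.log(1.0 - theta)
    return total


# Hypothetical toy data: intercept plus one binary covariate (rural = 1).
y_toy = [1, 0, 1, 0]
X_toy = [[1, 1], [1, 0], [1, 1], [1, 1]]
loglik_at_zero = bernoulli_log_likelihood(y_toy, X_toy, [0.0, 0.0])
```

With all coefficients set to zero every $\theta_{ik}$ equals 0.5, so the log-likelihood of the four toy observations is $4 \log 0.5$; exponentiating a fitted coefficient gives the corresponding odds ratio.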
These random effects are included in the model to account for extra variation. To incorporate unobserved influential factors that vary across the states, structured random effects are included in the model, which is defined as

$$\eta_{ik} = x_{ik}'\beta + u_k,$$

where $u_k$ is the structured spatial effect of state $k$. In general, the spatial effects of areal units can be modelled using the CAR model, which is commonly used to specify the second stage of hierarchical models. In this work, the CAR model is expressed as follows. Let $u = (u_1, \ldots, u_n)$ be the vector of univariate random variables associated with the spatial units under study, and let $\{\partial(i) : i = 1, \ldots, n\}$ denote the sets of states sharing a border with state $i$. That is, for any $i, k = 1, \ldots, n$, $k \in \partial(i)$ if and only if $i \in \partial(k)$, and $i \notin \partial(i)$. Suppose that the conditional distribution of $u_i$, $i = 1, \ldots, n$, is normal:

$$u_i \mid u_k, k \neq i \sim N\left(\mu_i + \sum_{k \in \partial(i)} c_{ik}(u_k - \mu_k),\; d_i^2\right), \qquad (3)$$

where $\mu_k$ is the mean for state $k$, $\mu_i$ is the spatial trend at location $i$ and $d_i^2 = \rho_u^2 / |\partial(i)|$ is the conditional variance of the $i$th state. The size of the variance for a state is therefore determined by its number of neighbours $|\partial(i)|$. The $c_{ik}$ are the spatial dependence parameters, with $c_{ii} = 0$ for all $i$, and $\rho_u^2$ is the variance parameter that controls the degree of spatial similarity. The matrix form of Eq. (3) is given by Cressie 33 as

$$u \sim N\left(\mu,\; B^{-1}L\right), \qquad (4)$$

which is the joint distribution of $u$, with $\mu = (\mu_1, \ldots, \mu_n)'$ and $L = \operatorname{diag}(d_1^2, \ldots, d_n^2)$. Equation (4) is valid if $B$ is invertible and $B^{-1}L$ is symmetric, which requires $c_{ik} d_k^2 = c_{ki} d_i^2$ for all $i \neq k$, and positive definite. The elements of the invertible matrix $B$ are

$$B_{ik} = \begin{cases} 1, & i = k, \\ -c_{ik}, & k \in \partial(i), \\ 0, & \text{otherwise}. \end{cases}$$

To obtain a valid joint distribution, the covariance matrix in Eq. (4) must be symmetric and positive definite, as mentioned above.
The symmetric weighted adjacency matrix is then $W = (W_{ik})$, with $c_{ik} = \phi W_{ik}$, where the parameter $\phi$ controls the properness of the distribution.
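The role of the adjacency matrix and of $\phi$ can be illustrated with a small example. The sketch below builds a binary adjacency matrix from neighbour lists and forms the precision matrix $Q = (D - \phi W)/\rho_u^2$, where $D$ is the diagonal matrix of neighbour counts. This uses a common parameterisation that is an assumption here and not taken from the text: binary weights, $c_{ik} = \phi W_{ik}/|\partial(i)|$ and $d_i^2 = \rho_u^2/|\partial(i)|$, for which $B^{-1}L = \rho_u^2 (D - \phi W)^{-1}$. The four-state map and all function names are hypothetical.

```python
def adjacency_matrix(neighbours):
    """Binary W from neighbour lists: W[i][k] = 1 if state k borders state i."""
    n = len(neighbours)
    W = [[0.0] * n for _ in range(n)]
    for i, nbrs in enumerate(neighbours):
        for k in nbrs:
            if k == i:
                raise ValueError("a state cannot be its own neighbour")
            W[i][k] = 1.0
    for i in range(n):
        for k in range(n):
            if W[i][k] != W[k][i]:
                raise ValueError("the neighbour relation must be symmetric")
    return W


def car_precision(W, phi, rho2=1.0):
    """Q = (D - phi * W) / rho2, with D = diag(number of neighbours)."""
    n = len(W)
    return [
        [((sum(W[i]) if i == k else 0.0) - phi * W[i][k]) / rho2 for k in range(n)]
        for i in range(n)
    ]


def is_strictly_diagonally_dominant(Q):
    """Sufficient check: a symmetric matrix with positive diagonal that is
    strictly diagonally dominant is positive definite."""
    return all(
        abs(Q[i][i]) > sum(abs(q) for k, q in enumerate(Q[i]) if k != i)
        for i in range(len(Q))
    )


# Hypothetical map of four states arranged in a line: 0 - 1 - 2 - 3.
toy_neighbours = [[1], [0, 2], [1, 3], [2]]
W_toy = adjacency_matrix(toy_neighbours)
Q_proper = car_precision(W_toy, phi=0.9)
Q_intrinsic = car_precision(W_toy, phi=1.0)
```

Under these assumptions, any $|\phi| < 1$ makes each row of $Q$ strictly diagonally dominant, so the joint distribution is proper, whereas $\phi = 1$ gives the intrinsic (improper) CAR model, for which the check fails.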
Also, to account for unobserved heterogeneity within each state, unstructured random effects are considered, and the model is expressed as

$$\eta_{ik} = x_{ik}'\beta + v_k.$$

The Besag-York-Mollié (BYM) model 34 is the most widely used tool among spatial Bayesian hierarchical models for disease mapping. The BYM model comprises two random components, a spatially structured component $u$ and a spatially unstructured component $v$, which are included in the log-linear model for the relative risk. Including these random effects ensures that the relative risk is smoothed at the state level. The spatially structured component $u$ captures the correlation between neighbouring spatial units, while the spatially unstructured component $v$ captures the uncorrelated extra variation. The vectors $u$ and $v$ contain the individual unit random effects $u_k$ and $v_k$ ($k = 1, \ldots, n$), respectively. The structured random-effects model is therefore extended to the convolution model by adding both structured and unstructured random effects as follows:

$$\eta_{ik} = x_{ik}'\beta + u_k + v_k,$$

where $x_{ik}'$ is a row-vector of covariates with $\beta$ as the corresponding vector of regression coefficients, $u_k$ is the spatially structured random effect (correlated heterogeneity) and $v_k$ is the spatially unstructured effect (uncorrelated heterogeneity). Following 34, the two random effects are assumed to be independent and require independent priors. For the spatially unstructured effects $v$, the prior is assumed to be a normal distribution with mean vector 0 and variance-covariance matrix $\sigma^2 I$, where $I$ is the identity matrix and $\sigma^2 > 0$ is unknown. Following the argument of 34, the prior of the spatial component is assumed to be a Gaussian Markov random field, or conditional Gaussian autoregressive model. Let $u_{-k}$ denote the vector of effects excluding the $k$th state; we then assume that

$$u_k \mid u_{-k} \sim N\left(\frac{1}{n_k} \sum_{e \sim k} u_e,\; \frac{\tau_u^2}{n_k}\right),$$

where $n_k$ is the number of neighbours of state $k$, $e \sim k$ denotes the units $e$ that are neighbours of state $k$, and $\tau_u$ is the standard deviation parameter.
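The conditional prior for the structured effect says that each state's effect is centred on the average of its neighbours' effects, with a variance that shrinks as the number of neighbours grows. The sketch below computes that conditional mean and variance, assuming the usual intrinsic CAR form with mean $\frac{1}{n_k}\sum_{e \sim k} u_e$ and variance $\tau_u^2 / n_k$ (treating $\tau_u$ as a standard deviation, as stated in the text). The function name, the toy map and the effect values are hypothetical.

```python
def icar_conditional(k, u, neighbours, tau_u):
    """Conditional mean and variance of u[k] given the other effects.

    Assumes the intrinsic CAR form: mean = average of the neighbours'
    effects, variance = tau_u**2 / (number of neighbours).
    """
    nbrs = neighbours[k]
    n_k = len(nbrs)
    if n_k == 0:
        raise ValueError("state has no neighbours; the conditional is undefined")
    mean = sum(u[e] for e in nbrs) / n_k
    variance = tau_u ** 2 / n_k
    return mean, variance


# Hypothetical map of four states in a line (0 - 1 - 2 - 3) with toy effects.
toy_neighbours = [[1], [0, 2], [1, 3], [2]]
toy_effects = [0.2, -0.1, 0.4, 0.0]

mean_1, var_1 = icar_conditional(1, toy_effects, toy_neighbours, tau_u=1.0)
mean_0, var_0 = icar_conditional(0, toy_effects, toy_neighbours, tau_u=1.0)
```

In this toy example state 1 has two neighbours, so its conditional variance is half that of state 0, which has only one; this is the smoothing behaviour that the structured component contributes to the convolution model.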
Finally, inverse gamma hyperpriors are assumed for the variances of the normal priors. The posterior distributions of the parameters were estimated using the Integrated Nested Laplace Approximation (INLA) 35 in R because, as an approximate Bayesian inference method, it is a computationally efficient alternative to Markov Chain Monte Carlo sampling 36. The Deviance Information Criterion (DIC) is calculated as $\mathrm{DIC} = \bar{D}(\theta) + p_D$, where $\bar{D}(\theta)$ is the posterior mean of the deviance, which measures the goodness of fit, and $p_D$ is the effective number of parameters in the model. The final model was selected on the basis of the DIC, with the model having the smallest DIC taken as the better fit 37.

Ethics approval and consent to participate. Secondary data were used for this study. The data are available for research purposes from the Demographic and Health Surveys (DHS) website. Therefore, formal ethical approval is not applicable for this study. All methods were carried out in accordance with relevant guidelines and regulations.

Results
Continuous variables with possible non-linear effects were studied; however, age in months was the only variable that showed a significant non-linear effect on the log-odds of a child's malaria result and anaemia status. Therefore, this is the only non-linear effect incorporated in the fitted model, while the other independent variables were added as linear fixed effects. Table 2 presents the 2010 and 2015 percentages of children examined for malaria and anaemia. Individual records of malaria and anaemia status in the two weeks before the interview were compiled for 5147 and 5056 children aged 0 to 5 years in 2010, respectively. The same was done in 2015 for 6025 children for malaria and 6021 children for anaemia. By child's age in months, group 6 (51-59 months) had the highest record of the two diseases in both years, and the record was higher among male children (2010: anaemia 50.5%, malaria 50.6%; 2015: anaemia 50.4%, malaria 50.4%). Most of these children lived in rural areas without electricity, television or good toilet facilities, and with unsafe sources of drinking water. By region, in both years the Northwest had the highest record of the two diseases, with a majority of illiterate mothers (2010: anaemia 45.6%, malaria 46.1%; 2015: anaemia 43.5%, malaria 43.5%), a low standard of living and poor building materials.

Table 4 presents the adjusted posterior odds ratio (AOR) estimates and 95% credible intervals for the random effects included in the Bayesian hierarchical logistic regression model. In 2010, there was a significant increase in the odds of malaria for children in age groups 2 (15 to 23 months), 3 (24 to 32 months), 4 (33 to 41 months), 5 (42 to 50 months) and 6 (51 to 59 months) relative to children in age group 1 (6 to 14 months). There was also a significant increase in the odds of malaria among children who had anaemia or resided in rural areas 2.
On the other hand, households with electricity had significantly lower odds of malaria than households without electricity. Similarly, the odds of malaria were significantly lower among those who use well water relative to those who use tap/other water, and among those who use a pit toilet relative to those who use a flush/other toilet. Furthermore, lower odds of malaria were suggested for children of mothers with a higher educational level (secondary and higher education) and for a higher wealth index (rich). Female children had higher odds of malaria than male children, but the difference was not significant; the same applies to main wall material and to households that had a radio. With respect to main roofing material, main floor material and households that had a television, the decrease in the odds of malaria was not significant. In 2015, the odds of malaria for children in age groups 2 (15 to 23 months), 3 (24 to 32 months), 4 (33 to 41 months), 5 (42 to 50 months) and 6 (51 to 59 months) increased significantly relative to children in age group 1 (6 to 14 months). Similarly, there was a significant increase in the odds of malaria among children who had anaemia or lived in rural areas with a zinc/metal roof. On the other hand, the odds of malaria were significantly lower among those who use a pit toilet relative to those who use a flush/other toilet. Furthermore, the results suggested that children of mothers with a higher educational level (secondary and higher education) and from households with a higher wealth index (middle and rich) had lower odds of malaria. Female children had higher odds of malaria than male children, but the difference was not significant; the same applies to electricity, main floor material, main wall material and households that had a radio. There was a non-significant increase in the odds of malaria with respect to source of drinking water and households that had a television.
Figures 3 and 4 show the estimated mean, median, 25% quantile and 95% quantile of the structured and unstructured spatial effects on the log-odds of malaria, respectively. Lower odds of malaria are associated with the eastern regions, which have a negative spatial effect, while higher odds of malaria are associated with the northern regions, which have a positive spatial effect. This could be attributed to several factors, among them that northern women are less educated and therefore less well informed about the disease. Also, most northern women are rural dwellers and so may lack knowledge of the protective role of modern housing against malaria. The structured spatial effect, which ranged from −0.6 to 1.0 for the mean, −0.6 to 1.2 for the median, −1.4 to 0.0 for the 25% quantile and 0.0 to 2.0 for the 95% quantile, was stronger than the unstructured spatial effect, which ranged from −0.8 to 0.8 for the mean, −0.8 to 1.0 for the median, −1.6 to 0.0 for the 25% quantile and 0.0 to 1.8 for the 95% quantile. This indicates that the effect a particular region has on the risk of malaria is similar to that of its neighbouring regions. It also suggests that demographic, socio-economic and geographical factors that extend beyond regional boundaries may play a significant role in childhood malaria. In addition, the strong relationship between malaria and anaemia in Nigeria could be the result of the homogeneous spatial effect on childhood malaria in the country, which was accounted for by including the child's anaemia status.
The adjusted posterior odds ratio (AOR) estimates and 95% credible intervals for the random effects incorporated in the Bayesian hierarchical logistic regression model are presented in Table 5. In 2010, the odds of anaemia increased significantly for children in age groups 3 (24 to 32 months), 4 (33 to 41 months), 5 (42 to 50 months) and 6 (51 to 59 months) relative to children in age group 1 (6 to 14 months). The odds of anaemia also increased significantly among children who had malaria or lived in rural areas. Similarly, regarding wealth index, children from middle and rich households had significantly higher odds of anaemia relative to children from poor households. On the other hand, households with electricity had non-significantly lower odds of anaemia than households without electricity. There was a non-significant decrease in the odds of anaemia among those who use well water relative to those who use tap/other water, and among those who use a pit toilet relative to those who use a flush/other toilet. Furthermore, children of mothers with a higher educational level (secondary and higher education) had lower odds of anaemia, but the difference was not significant. The odds of anaemia were significantly higher for female children than for male children; the same applies to households that had a radio and a television. No significant decrease in the odds of anaemia was recorded with respect to main roof material, main floor material and main wall material. In 2015, the odds of anaemia decreased significantly for children in age groups 3 (24 to 32 months), 4 (33 to 41 months), 5 (42 to 50 months) and 6 (51 to 59 months) relative to children in age group 1 (6 to 14 months). Also, the odds of anaemia increased significantly among children who had malaria, as well as among those who lived in rural areas and used well water and a pit toilet. On the other hand, a non-significant increase in the odds of anaemia was noted for wealth index, electricity and concrete/ceramic floors.
The odds of anaemia decreased non-significantly among female children relative to male children and among those who have a television relative to those who do not. With respect to main roof material, main floor material, main wall material and households that have a radio, the odds of anaemia did not decrease significantly.

Figures 5 and 6 show the estimated mean, median, 25% quantile and 95% quantile of the structured and unstructured spatial effects on the log-odds of anaemia, respectively. Lower odds of anaemia were associated with the northeast regions, which have a negative spatial effect, while higher odds of anaemia were associated with the core north regions, which have a positive spatial effect. This may be because most northern women are less educated and therefore less well informed about the disease. Also, most northern women are rural dwellers and so may lack knowledge of the protective role of modern housing against anaemia. The structured spatial effect, which ranged from −1.5 to 1.0 for the mean, −1.5 to 1.0 for the median, −2.5 to 0.5 for the 25% quantile and −0.5 to 1.5 for the 95% quantile, was estimated to be stronger than the unstructured spatial effect, which ranged from −0.15 to 0.15 for the mean, −0.08 to 0.08 for the median, −0.8 to −0.2 for the 25% quantile and 0.1 to 0.8 for the 95% quantile. As in the previous maps, the factors that influence the risk of malaria are similar across neighbouring regions and transcend their boundaries. In addition, the strong correlation between malaria and anaemia may be due to the homogeneous spatial effect on childhood anaemia, which was accounted for by including the child's malaria status.

Discussion
In this study, a hierarchical Bayesian logistic regression model was adopted to investigate the spatial variation and the socio-economic, demographic and geographical factors of malaria and anaemia in children under 5 years of age in Nigeria using two datasets. Spatial effect maps were generated from the available data to identify the most endemic regions. The results of this study are in line with previous works 11,22,27,38 in finding that girls have a lower risk of malaria or anaemia, and that an increase in wealth index and in mother's educational level decreases a child's risk. It can be inferred that the more educated an individual is, the more aware of and better informed about health-related issues they are. Also, from the results, inefficient public health resource allocation and socio-economic conditions are strongly associated with geographical inequalities in health 27. Similarly, when individuals have a high income, access to good health care and nutritious food sources reduces the risk of the diseases.

In both years, the type of place of residence was significantly associated with the two diseases. Children in rural areas tend to have higher rates of the diseases than their urban contemporaries 22,27,39. In addition, female children had a lower risk of anaemia in both years, while for malaria, female children had non-significantly higher odds in 2010 and lower odds in 2015 compared to male children. This could be attributed to the customary practice in Nigeria, especially among the Igbo, whereby fathers prefer male children to female children 40. This could be the reason female children have the advantage of being breastfed for a longer period and of getting better healthcare and nutrition, thereby improving their health status, because they are left to be taken care of by their mothers 15.
Male children, in contrast, may well be denied their mother's care, which may result in blood loss from injury and in higher levels of parasitosis, in agreement with previous studies 22,27,41. As seen in Fig. 6, there is a significant association between malaria and anaemia in the two years. This suggests that the high rate of anaemia in this country is caused by malaria 11,17. Children with a better source of drinking water are less affected by malaria and anaemia. It could be said that a better source of drinking water ensures good health for children, which strengthens their immunity. Likewise, the type of toilet facility has a significant association with malaria, which in turn causes anaemia. This is because, where sanitation is poor, the facility may provide a conducive environment for the breeding of mosquitoes, which increases the risk of malaria parasites and then results in anaemia 11. In this study, wall, floor and roof materials were found not to be significantly associated with malaria or anaemia. However, these factors are highly correlated with malaria; for this reason, much of their effect on childhood anaemia is accounted for when the child's malaria status is included 2,11,25. Our results show that the vulnerability of children under 5 to malaria infection increases with age, while vulnerability to anaemia decreases with age. For malaria, the reason could be that children aged 6 to 23 months are protected through maternal immunity 8,27. For anaemia, on the other hand, disease-controlling immunity develops as the child grows, such that nearly all malaria infections are asymptomatic in adolescents and adults; thus, there will be a decline in the prevalence of anaemia 17. The reason for considering more than one disease in two different years is to ascertain whether both diseases are affected by the same demographic, socio-economic or geographical factors.
In this study, the unstructured spatial effect is seen to be moderately weak in contrast to the structured spatial effect, signifying that the prevalence of malaria and anaemia in each of the years is similar among neighbouring states. In 2010, malaria and anaemia were more densely distributed across all Nigerian states than in 2015. Also, in 2015, there is an obvious increase in mothers' educational levels, which is a major reason for this change. Notwithstanding the differences between the two years considered, there were similarities: as seen in Figs. 1 and 2, both malaria and anaemia were predominant in the North-West, the North-Central and some parts of the South-East in the two datasets. Figure 7 shows the significance of the non-linear effect of rural area on malaria and anaemia. It can be deduced from our results that there was a significant increase in malaria in rural areas in both years, while for anaemia, the effect increased but dropped slightly at some point. This is because, as malaria infection increases, it gets to a point where individuals develop disease-controlling immunity such that all those who have malaria are asymptomatic, and this diminishes the prevalence of anaemia 17.

Conclusion
The results of this study clarify the risk factors of malaria and anaemia in children aged 6 to 59 months and give a better understanding of strategies to mitigate the impacts of the diseases from a public health standpoint. For both diseases, children living in rural areas with illiterate mothers are the most affected. Malaria and anaemia control measures should take into account the spatial differences that are apparent across these regions. The odds of malaria and anaemia among children aged 6 to 59 months are associated with the child's age, mother's level of education, place of residence, source of drinking water, low income, type of toilet facility and the presence of either of the diseases. Factors that contribute to spatial heterogeneity should be considered, with a focus on assessing local, region-specific causes of anaemia and malaria in children.

Data availability
The data that support the findings of this study are available from Jecinta U. Ibeji, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Jecinta U. Ibeji.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.